A Machine Learning Framework for Automatically Annotating Web Pages with Simple HTML Ontology Extension (SHOE)
Abstract
With enormous amounts of information added to the Internet every second, manually maintaining a knowledge base of its content is a hopeless task. A reasonable remedy is to create a “machine understandable” Internet. To achieve this, Heflin et al. proposed an HTML-based knowledge representation language called Simple HTML Ontology Extension (SHOE). SHOE can be used in many application domains, but it requires users to annotate web pages manually. To overcome this shortcoming, we created a machine learning framework called AutoSHOE that automatically annotates web pages with SHOE annotations. With this framework, users can easily collect SHOE-annotated pages as training data, experiment with different feature selection methods and learning algorithms to find the best approach for learning a particular ontology, and automatically annotate new web pages with the trained classifiers and rule sets. In addition, AutoSHOE allows new feature selectors and learners to be plugged into the system easily and run from anywhere through the web. We present the AutoSHOE architecture and then discuss experimental results from our proof-of-concept design.
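The workflow the abstract describes (collect annotated pages, pick a feature selector and a learner, then annotate new pages) could look roughly like the sketch below. It is not AutoSHOE's actual code: every name in it (`register`, `top_frequency`, `OverlapLearner`, `train`, `annotate`) is hypothetical, the learner is a toy stand-in, and the emitted tag is only a simplified SHOE-style category claim.

```python
from collections import Counter

# Hypothetical plug-in registries; AutoSHOE's real interfaces are not
# described in the abstract, so these names are assumptions.
FEATURE_SELECTORS = {}
LEARNERS = {}


def register(registry, name):
    """Decorator that plugs a component into a registry under a name."""
    def wrap(component):
        registry[name] = component
        return component
    return wrap


def tokenize(text):
    """Lowercase whitespace tokenizer, kept deliberately simple."""
    return text.lower().split()


@register(FEATURE_SELECTORS, "top_frequency")
def top_frequency(pages, k=5):
    """Keep the k most frequent tokens across all training pages."""
    counts = Counter(tok for text, _ in pages for tok in tokenize(text))
    return [tok for tok, _ in counts.most_common(k)]


@register(LEARNERS, "overlap")
class OverlapLearner:
    """Toy learner: picks the category whose training pages share the
    most selected features with the new page."""

    def fit(self, pages, features):
        self.features = set(features)
        self.profiles = {}
        for text, label in pages:
            kept = (t for t in tokenize(text) if t in self.features)
            self.profiles.setdefault(label, Counter()).update(kept)
        return self

    def predict(self, text):
        toks = [t for t in tokenize(text) if t in self.features]
        return max(
            self.profiles,
            key=lambda label: sum(self.profiles[label][t] for t in toks),
        )


def train(pages, selector_name, learner_name):
    """Look up the named plug-ins and train a model on labeled pages."""
    features = FEATURE_SELECTORS[selector_name](pages)
    return LEARNERS[learner_name]().fit(pages, features)


def annotate(model, text, key):
    """Emit a simplified SHOE-style category claim for a new page."""
    label = model.predict(text)
    return f'<INSTANCE KEY="{key}"><CATEGORY NAME="{label}"></INSTANCE>'


# Illustrative training data: (page text, ontology category) pairs.
training_pages = [
    ("professor teaches course lecture professor", "Professor"),
    ("student enrolls course homework student", "Student"),
]
model = train(training_pages, "top_frequency", "overlap")
print(annotate(model, "the professor gives a lecture", "http://example.org/p1"))
```

Keeping selectors and learners behind name-keyed registries is one way to get the "easily plugged in" property: trying a different combination only means passing different names to `train`.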
Similar resources
Automated Ontology Learning for a Semantic Web
By expressing web page content in a format that machines can understand, the semantic web provides huge possibilities for the Internet and for machine reasoning. Unfortunately, there is a considerable distance between the present-day World Wide Web and the semantic web of the future. The process of annotating the Web to make it semantic web-ready is quite long and not without resistance. In thi...
Ontology-Based Knowledge Discovery on the World-Wide Web
This paper describes SHOE, a set of Simple HTML Ontology Extensions. SHOE allows World-Wide Web authors to annotate their pages with ontology-based knowledge about page contents. We present examples showing how the use of SHOE can support a new generation of knowledge-based search and knowledge discovery tools that operate on the World-Wide Web.
Dynamic Ontologies on the Web
We discuss the problems associated with managing ontologies in distributed environments such as the Web. The Web poses unique problems for the use of ontologies because of the rapid evolution and autonomy of web sites. We present SHOE, a web-based knowledge representation language that supports multiple versions of ontologies. We describe SHOE in the terms of a logic that separates data from on...
Knowledge discovery – semantic web
Knowledge management is a process that comprises knowledge discovery, knowledge collection, knowledge organization and knowledge processing. Among these four processes, knowledge discovery is integrated with the semantic web for enhanced information retrieval. Knowledge discovery is the process of automatically searching large volumes of data for patterns that can be considered knowledge about the data. T...
Coping with Changing Ontologies in a Distributed Environment
We discuss the problems associated with versioning ontologies in distributed environments. This is an important issue because ontologies can be of great use in structuring and querying Internet information, but many of the Internet’s characteristics, such as distributed ownership, rapid evolution, and heterogeneity, make ontology management difficult. We present SHOE, a web-based knowledge repres...